Graphene wetting benchmark by dwl38 · Pull Request #333 · ddmms/ml-peg

dwl38 · 2026-01-31T19:21:21Z

Pre-review checklist for PR author

PR author must check the checkboxes below when creating the PR.

I've confirmed the contribution guidelines.

Summary

Added a new benchmark based on the adsorption energy curves of a single water molecule (at various orientations) on a sheet of graphene (under various strain conditions), which is useful for understanding nanoscale wetting. Reference calculations use the PBE functional and were run with FHI-aims on "intermediate" settings.

Three metrics of equal weight:

MAE of all single-point calculations (across all orientations, strains, and distances)
MAE of binding energies (across all orientations and strains), by comparing fitted adsorption energy curves
MAE of binding lengths (across all orientations and strains), from the same method
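
As a rough illustration of the second and third metrics, the sketch below estimates a binding length and binding energy from one sampled adsorption curve by fitting a parabola through the lowest sampled point and its two neighbours. The fitting method actually used in this PR is not described here, so the three-point parabola and the helper name `binding_from_curve` are assumptions made for the example.

```python
def binding_from_curve(distances, energies):
    """Estimate (binding_length, binding_energy) from a sampled curve.

    Hypothetical helper: fits a parabola through the lowest sampled point
    and its two neighbours, then returns the vertex of that parabola.
    Returns None when no bracketed, convex minimum is found.
    """
    if len(distances) != len(energies) or len(distances) < 3:
        raise ValueError("need at least three matching (distance, energy) samples")

    i = min(range(len(energies)), key=energies.__getitem__)
    if i == 0 or i == len(energies) - 1:
        # Minimum sits on the edge of the scan, so it is not bracketed.
        return None

    x0, x1, x2 = distances[i - 1 : i + 2]
    y0, y1, y2 = energies[i - 1 : i + 2]

    # Newton divided differences for the interpolating parabola:
    # y = y0 + d1*(x - x0) + d2*(x - x0)*(x - x1)
    d1 = (y1 - y0) / (x1 - x0)
    d2 = ((y2 - y1) / (x2 - x1) - d1) / (x2 - x0)
    if d2 <= 0:
        # Not convex, so the vertex would not be a minimum.
        return None

    # dy/dx = d1 + d2*(2x - x0 - x1) = 0 gives the vertex position.
    length = 0.5 * (x0 + x1) - d1 / (2.0 * d2)
    energy = y0 + d1 * (length - x0) + d2 * (length - x0) * (length - x1)
    return length, energy
```

Applying the same helper to the reference curve and to a model curve gives two (length, energy) pairs whose absolute differences feed the two MAEs. Returning None for a curve without a bracketed minimum is one way to flag models that do not produce a physically reasonable well.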

Linked issue

Resolves #292

Progress

Calculations
Analysis
Application
Documentation

Testing

Benchmark tested on all currently-implemented models, i.e.:

mace-mp-0a
mace-mp-0b3
mace-mpa-0
mace-omat-0
mace-matpes-r2scan
orb-v3-consv-inf-omat
pet-mad

with no issues. At the time of writing, mace-mp-0a performs best on all metrics, whereas pet-mad completely fails to produce physically reasonable adsorption energy curves.

New decorators/callbacks

Added a new "plot_from_scatter" callback, which implements almost identical functionality to "struct_from_scatter" except that it renders a Plotly Graph object instead.

dwl38 · 2026-01-31T19:22:30Z

Database file:
graphene_wetting_under_strain.zip

joehart2001 · 2026-02-03T22:20:55Z

Looking super good overall! The key thing I've noticed is the processed_data function in analysis. The issue is that, because the decorators are inside the function, the script will fail if one model has no data. So I think we need to think about how to split it up so that it's more robust.

update: this is an us problem not a you problem

ElliottKasoar

Thanks for this, it's looking really nice! I've left a few mostly minor comments.

My main suggestions/questions revolve around the app callbacks. I think what you have works really well, but if possible it would be great to integrate any modifications you need into the existing helper functions we have.

Also, in terms of visualisation, I wonder how tricky it would be to view the structure as a trajectory vs distance, a bit like our live NEBs example - so if you click on a point, it shows the structure at that distance, but you can also play the trajectory as the distance changes.

ElliottKasoar · 2026-02-18T17:35:06Z

ml_peg/calcs/surfaces/graphene_wetting_under_strain/calc_graphene_wetting_under_strain.py

+    global DATABASE_INFO_SAVED
+    if not DATABASE_INFO_SAVED:
+        OUT_PATH.mkdir(parents=True, exist_ok=True)
+        database_info_path = OUT_PATH / "database_info.yml"


Can you not just check if (OUT_PATH / "database_info.yml").exists()?
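
A minimal sketch of that suggestion, assuming the file only needs to be written when it is not already on disk. The helper name `write_database_info_once` and its `contents` argument are hypothetical; only the output directory and the `database_info.yml` filename come from the diff.

```python
from pathlib import Path


def write_database_info_once(out_path: Path, contents: str) -> bool:
    """Write database_info.yml unless it already exists.

    Hypothetical helper. Returns True if the file was written by this call
    and False if an existing file was left untouched.
    """
    out_path.mkdir(parents=True, exist_ok=True)
    database_info_path = out_path / "database_info.yml"
    if database_info_path.exists():
        return False
    database_info_path.write_text(contents)
    return True
```

One difference from the module-level flag: a file left over from an earlier run is kept rather than rewritten, which may or may not be the desired behaviour.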

ElliottKasoar · 2026-02-18T17:36:15Z

ml_peg/calcs/surfaces/graphene_wetting_under_strain/calc_graphene_wetting_under_strain.py

+    water_energy = atoms.get_potential_energy()
+
+    # Iterate through strain conditions
+    for strain in strains:


It's fairly minor since it's so short, but it might be nice to use tqdm for this, just so we can see progress

For slower models, it can take a minute or more, and since it's very reasonable to run this locally, that might be concerning
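
A small sketch of what this could look like, assuming the loop body can be expressed as a function of the strain. `run_strain_scan` and `compute` are hypothetical names, and the fallback to a plain loop is only there so the example runs without tqdm installed.

```python
try:
    from tqdm import tqdm
except ImportError:
    # Assumption for this sketch only: degrade to a plain loop without tqdm.
    def tqdm(iterable, **kwargs):
        return iterable


def run_strain_scan(strains, compute):
    """Call compute(strain) for each strain condition, showing a progress bar."""
    results = {}
    for strain in tqdm(strains, desc="Strain conditions"):
        results[strain] = compute(strain)
    return results
```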

ElliottKasoar · 2026-02-18T17:55:20Z

ml_peg/analysis/surfaces/graphene_wetting_under_strain/analyse_graphene_wetting_under_strain.py

+                @plot_scatter(
+                    filename=OUT_PATH / model / f"figure_{orientation}_{strain}.json",
+                    title=f"{orientation} binding energy curve ({strain[1:5]}% strain)",
+                    x_label="Distance / Å",
+                    y_label="Adsorption energy / meV",
+                    show_line=True,
+                )
+                def plot_model_binding_energy_curve(
+                    model, orientation, strain
+                ) -> dict[str, tuple[list[float], list[float]]]:
+                    return {
+                        "ref": (
+                            results["distances"],
+                            results["ref"][orientation][strain]["energies"],
+                        ),
+                        model: (
+                            results["distances"],
+                            results[model][orientation][strain]["energies"],
+                        ),
+                    }


I would probably move this and the other plots out of this function, probably just to a top-level function that is called here.

You may still need to wrap it in a function to parameterise the plot_scatter decorators, but it makes processed_data a bit less unwieldy
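
One possible shape for that refactor is sketched below: the data assembly moves to a top-level function, and a second top-level function applies the parameterised decorator to a thin wrapper. Both function names are hypothetical, and the decorator is passed in as `decorator_factory` so the sketch makes no assumption about how plot_scatter is imported or what else it does.

```python
def binding_energy_curves(results, model, orientation, strain):
    """Return reference and model curves in the shape used in the diff."""
    distances = results["distances"]
    return {
        "ref": (distances, results["ref"][orientation][strain]["energies"]),
        model: (distances, results[model][orientation][strain]["energies"]),
    }


def plot_binding_energy_curve(
    decorator_factory, results, model, orientation, strain, **plot_kwargs
):
    """Decorate a thin wrapper with a parameterised decorator, then call it.

    decorator_factory stands in for something like plot_scatter; plot_kwargs
    are forwarded to it unchanged (filename, title, axis labels, ...).
    """

    @decorator_factory(**plot_kwargs)
    def _curves():
        return binding_energy_curves(results, model, orientation, strain)

    return _curves()
```

Inside processed_data, the nested definition would then reduce to a single call per model, orientation, and strain.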

ElliottKasoar · 2026-02-18T17:57:30Z

ml_peg/analysis/surfaces/graphene_wetting_under_strain/analyse_graphene_wetting_under_strain.py

+                for i in range(len(processed_data["distances"])):
+                    deviations.append(
+                        abs(
+                            processed_data[model][orientation][strain]["energies"][i]
+                            - processed_data["ref"][orientation][strain]["energies"][i]
+                        )
+                    )


Can this be rewritten to use mae from analysis.utils? This is absolutely fine, but we intend at some point to make changes such that RMSE etc. are computed alongside MAE in a swappable way, and so it would be simpler if we're able to swap out a single function.
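
A sketch of the swappable form, assuming the nested layout `processed_data[model][orientation][strain]["energies"]` seen in the diff. The `mae` and `rmse` functions below are local stand-ins, since the signature of the helper in analysis.utils is not shown here, and `energy_error` is a hypothetical name.

```python
import math


def mae(ref, pred):
    """Stand-in mean absolute error over two equal-length sequences."""
    return sum(abs(p - r) for r, p in zip(ref, pred)) / len(ref)


def rmse(ref, pred):
    """Stand-in root mean squared error over two equal-length sequences."""
    return math.sqrt(sum((p - r) ** 2 for r, p in zip(ref, pred)) / len(ref))


def energy_error(processed_data, model, metric=mae):
    """Flatten all single-point energies, then apply one metric function."""
    ref_values, model_values = [], []
    for orientation, strains in processed_data["ref"].items():
        for strain, entry in strains.items():
            ref_values.extend(entry["energies"])
            model_values.extend(
                processed_data[model][orientation][strain]["energies"]
            )
    return metric(ref_values, model_values)
```

Collecting the values first and calling the metric once means that switching from MAE to RMSE is a change to a single argument.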

ElliottKasoar · 2026-02-18T17:59:04Z

ml_peg/analysis/surfaces/graphene_wetting_under_strain/analyse_graphene_wetting_under_strain.py

+                    )
+                )
+        results[model] = np.nan_to_num(
+            np.mean(deviations), nan=99999, posinf=99999, neginf=99999


For now, if a model fails I'd probably return None. We have plans to translate something like np.inf to the 0 score, but None would probably be most consistent, if it works ok here.
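
A minimal sketch of the None-returning variant, assuming `deviations` is a plain list of floats. `mean_or_none` is a hypothetical name, and whether the downstream table handles None cleanly is not verified here.

```python
import math


def mean_or_none(values):
    """Return the mean of values, or None if it is empty or not all finite."""
    if not values or not all(math.isfinite(v) for v in values):
        return None
    return sum(values) / len(values)
```

This replaces the sentinel value (99999) with an explicit "no result", so a failed model cannot be mistaken for one with a very large but real error.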

ElliottKasoar · 2026-02-18T17:59:15Z

ml_peg/analysis/surfaces/graphene_wetting_under_strain/analyse_graphene_wetting_under_strain.py

+                    )
+                )
+        results[model] = np.nan_to_num(
+            np.mean(deviations), nan=999, posinf=999, neginf=999


ElliottKasoar · 2026-02-18T18:00:45Z

ml_peg/analysis/surfaces/graphene_wetting_under_strain/metrics.yml

+    good: 40.0
+    bad: 1000.0
+    unit: meV
+    weight: 1.0
+    tooltip: Mean Absolute Error across all orientations, distances, and strains
+    level_of_theory: PBE
+  Binding Energies MAE:
+    good: 40.0
+    bad: 1000.0
+    unit: meV
+    weight: 1.0
+    tooltip: Mean Absolute Error of binding energies across all orientations and strains
+    level_of_theory: PBE
+  Binding Lengths MAE:
+    good: 0.0
+    bad: 1.0


Have you thought about these good and bad thresholds (genuine question)?

1 eV seems quite large to me
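
To make the question concrete, the sketch below assumes a metric is mapped to a score between 0 and 1 by linear interpolation between `good` and `bad`, clamped at both ends. This mapping is an assumption for illustration only and is not necessarily how ml-peg normalises scores; `linear_score` is a hypothetical name.

```python
def linear_score(value, good, bad):
    """Map a metric value to [0, 1]: 1 at `good`, 0 at `bad`, linear between.

    Assumed scoring rule for illustration; values outside the range are clamped.
    """
    if good == bad:
        raise ValueError("good and bad thresholds must differ")
    fraction = (bad - value) / (bad - good)
    return max(0.0, min(1.0, fraction))
```

Under this assumption, with good = 40 meV and bad = 1000 meV, an MAE of 520 meV would still receive a score of 0.5, which is the kind of leniency the comment above is asking about.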

ElliottKasoar · 2026-02-18T18:02:23Z

ml_peg/app/surfaces/graphene_wetting_under_strain/app_graphene_wetting_under_strain.py

+        )
+
+
+def struct_from_scatter_custom(scatter_id, struct_id, structs):


What does the normal struct_from_scatter function not do that you need?

ElliottKasoar · 2026-02-18T18:04:27Z

ml_peg/app/surfaces/graphene_wetting_under_strain/app_graphene_wetting_under_strain.py

+        )
+
+
+def plot_and_struct_from_scatter(scatter_id, plot_id, plots_list, struct_id, structs):


I'm not sure I understand why this is needed in addition to the other callbacks we have/you have added?

ElliottKasoar · 2026-02-18T18:05:41Z

ml_peg/app/surfaces/graphene_wetting_under_strain/app_graphene_wetting_under_strain.py

+            return (
+                Div("Click on a metric to view plot."),
+                Div("Click on a metric to view plot."),
+                Div("Click on a metric to view the structure."),
+            )


Why do we want all of these displayed, and what does this do that we can't do with existing callback helpers?

dwl38 added 2 commits January 30, 2026 18:14

Added new callback for plot_from_struct

c620594

Added graphene wetting under strain benchmark

def095c

dwl38 mentioned this pull request Jan 31, 2026

Graphene Wetting Under Strain #292

Open

update logic for database info yml -> include in s3

ab24795

ElliottKasoar added the new benchmark Proposals and suggestions for new benchmarks label Feb 6, 2026

ElliottKasoar self-requested a review February 18, 2026 15:49

ElliottKasoar reviewed Feb 18, 2026
